
Add sampling API back to LlamaTokenDataArray; Add DRY and XTC Samplers #594

Merged: 19 commits into utilityai:main on Dec 9, 2024

Conversation

@nkoppel (Contributor) commented Dec 4, 2024

The changes in 0.1.84 removed the ability to use samplers with LlamaTokenDataArrays. This PR adds the sampling methods from version 0.1.83 back to LlamaTokenDataArray, adds support for the XTC and DRY samplers, and allows users to apply a LlamaSampler to a LlamaTokenDataArray.

These changes enable users to mix custom samplers with llama.cpp samplers, and make the LlamaTokenDataArray struct useful again.
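A minimal sketch of that flow; `apply_sampler` and `top_k` are assumed names for the restored API rather than a verified excerpt:

```rust
use llama_cpp_2::sampling::LlamaSampler;
use llama_cpp_2::token::data_array::LlamaTokenDataArray;

fn filter_candidates(candidates: &mut LlamaTokenDataArray) {
    // Custom step: mutate the candidate list directly in Rust first,
    // e.g. suppress unwanted tokens by lowering their logits.

    // Then run a llama.cpp sampler over the same candidates
    // (hypothetical constructor and method names).
    let mut top_k = LlamaSampler::top_k(40);
    candidates.apply_sampler(&mut top_k);
}
```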

Let me know if you'd like to see any changes!

@MarcusDunn (Contributor) left a comment

This looks good. I've got a couple of comments that don't require changes, but that I would like your thoughts on before I merge.

@nkoppel (Contributor, Author) commented Dec 7, 2024

I've gone ahead and overhauled the sampling API in a way that resolves your third review comment, though it is very different from what came before. In my latest version, a LlamaSampler can be defined as either a chain or a single sampler, as determined by LlamaSamplerParams, an enum that holds all parameters needed to construct a LlamaSampler. I used this to factor out all of the LlamaSampler::add_ and LlamaTokenDataArray::sample_ methods in favor of selecting the sampler through the enum. This deduplicates a lot of code without making the interface much clunkier than it was.

If you think that this API is too far from the raw API or complicates things too much, let me know.
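For illustration, the enum-based construction described above might look roughly like this; the variants and fields are guesses for the sake of example, not the PR's actual definitions:

```rust
/// Illustrative guess at the shape of `LlamaSamplerParams`.
pub enum LlamaSamplerParams<'a> {
    /// A chain of samplers applied in order.
    Chain { stages: &'a [LlamaSamplerParams<'a>] },
    TopK(i32),
    Temp(f32),
    /// XTC sampler parameters.
    Xtc { probability: f32, threshold: f32, min_keep: usize, seed: u32 },
    /// Final stage: sample a token from the remaining distribution.
    Dist { seed: u32 },
}
```

A single constructor such as `LlamaSampler::new(params)` (hypothetical name) would then replace the per-sampler `add_*` and `sample_*` methods.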

@nkoppel requested a review from @MarcusDunn on Dec 7, 2024 at 20:32
@MarcusDunn (Contributor) left a comment

I think there's a bit of thinking to do around lifetimes before this gets merged.

If you want to remove the samplers that require lifetime logic, it would be nice to get a first iteration in with the trivial samplers; then we can add the more complex ones in a separate PR. There's a lot of great work here that I'd hate to see kept out of main because we're battling to encode model-lifetime guarantees into the sampler.

(Inline review threads on llama-cpp-2/src/sampling.rs, since resolved.)
@MarcusDunn (Contributor) commented
I don't mind the new API. I weakly preferred the "closer to direct calls" one.

@MarcusDunn (Contributor) commented Dec 8, 2024

I think a LlamaSampler struct with a collection of factory functions would be preferable to the params + enum approach. This would also allow different lifetime signatures for different samplers (for example, have most return LlamaSampler<'static> when the sampler owns everything, and LlamaSampler<'a> when it borrows a model).
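A compilable sketch of that design, with stand-in types and assumed constructor names:

```rust
use std::marker::PhantomData;

// Stand-ins for the real types; actual code would wrap llama.cpp's
// `llama_sampler` pointer and use llama-cpp-2's `LlamaModel`.
struct RawSampler;
struct LlamaModel;

pub struct LlamaSampler<'a> {
    raw: RawSampler,
    // Ties the sampler to any model data it borrows.
    _model: PhantomData<&'a LlamaModel>,
}

impl LlamaSampler<'static> {
    /// Owns all of its state, so it can outlive every model.
    pub fn temp(_t: f32) -> Self {
        LlamaSampler { raw: RawSampler, _model: PhantomData }
    }
}

impl<'a> LlamaSampler<'a> {
    /// Borrows the model (DRY, for instance, needs vocabulary access),
    /// so the returned sampler cannot outlive it.
    pub fn dry(_model: &'a LlamaModel, _multiplier: f32) -> LlamaSampler<'a> {
        LlamaSampler { raw: RawSampler, _model: PhantomData }
    }
}
```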

@nkoppel (Contributor, Author) commented Dec 8, 2024

Okay, I've rewritten the API again, and I think this is about as simple as it will get while still allowing LlamaSampler to be either a chain or a single sampler. It is also about as close to the raw API as it was before this PR. I will write documentation for the new methods on LlamaSampler later today.
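Roughly how the rewritten API reads in use; the constructor names (`chain_simple`, `top_k`, `temp`, `dist`) are my best guess at the final method set rather than a verified excerpt of the merged code:

```rust
use llama_cpp_2::sampling::LlamaSampler;

fn build_sampler() -> LlamaSampler {
    // A chain behaves like a single sampler; each stage narrows or
    // reweighs the candidates, and the last stage picks a token.
    LlamaSampler::chain_simple([
        LlamaSampler::top_k(40),
        LlamaSampler::temp(0.8),
        LlamaSampler::dist(1234), // seeded final token selection
    ])
}
```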

@MarcusDunn (Contributor) left a comment

Minor nitpicks. This looks good. Address or dismiss as you see fit, and once you're happy I will merge as long as the tests pass.

I think this is also one of the parts of the library that can reasonably be tested (stuff that does inference or loads a model is too hard), so a couple of tests (preferably doc tests) would be lovely.
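For instance, a doc test along these lines; the calls shown are assumptions about the final API, and the function is a placeholder:

````rust
/// Keeps only the `k` most likely candidates.
///
/// ```
/// use llama_cpp_2::sampling::LlamaSampler;
/// let sampler = LlamaSampler::top_k(1);
/// // Build a LlamaTokenDataArray from logits, apply the sampler,
/// // and assert that only the top candidate remains.
/// ```
fn _doc_test_example() {}
````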

Thanks for the great PR and patience with the back and forth.

@MarcusDunn merged commit cf69db5 into utilityai:main on Dec 9, 2024
2 of 5 checks passed